Abstract: Accurate detection of lung lesions in computed tomography (CT)
scans is essential for clinical diagnosis and provides valuable
information for the treatment of lung cancer. However, fully automatic
lesion detection is difficult to achieve. Here, a novel segmentation
algorithm is proposed: an improved toboggan algorithm with a three-step
framework comprising automatic seed point selection, multi-constraint
lesion extraction and lesion refinement. Then, local binary pattern
(LBP), wavelet, contourlet and grey level co-occurrence matrix (GLCM)
descriptors are applied to each region of interest of the segmented
lung lesion image to extract texture features such as contrast,
homogeneity, energy and entropy, and statistical features such as mean,
variance, standard deviation and the convolution of modulated and
normal frequencies. Finally, support vector machine (SVM) and K-nearest
neighbour (KNN) classifiers are applied to classify the abnormal region
based on the extracted features, and their performance is compared. An
accuracy of 97.8% is obtained with the SVM classifier in the comparison
with the KNN classifier. This approach does not require any human
interaction for lesion detection. Thus, the improved toboggan algorithm
can achieve precise lung lesion segmentation in CT images, and the
extracted features help to classify the lesion region of the lungs
efficiently.
I.  INTRODUCTION

Lung cancer is the most frequent cause of cancer-related death in men
and the second most frequent in women. In 2012, lung cancer was
diagnosed in 1.8 million people and caused 1.6 million deaths [1].
Early identification improves the chance of survival. In the United
States, about 17.4% of people diagnosed with lung cancer survive five
years after diagnosis, while outcomes are worse in the developing world
[1]. In-depth analysis of radiographic images provides information
about the tumour microenvironment and the intra-tumoral heterogeneity
needed for personalized medicine [2]. High-throughput testing of large
numbers of image features extracted from computed tomography (CT)
provides information about spatial and temporal genetic heterogeneity
in a non-invasive way. This can give better results than invasive
biopsy and supports medical research, computer-aided diagnosis,
radiotherapy and the evaluation of surgery results [5]. For this
purpose, accurate segmentation of lung lesions is needed. One approach
is for radiologists to delineate the lesion manually. They may
overestimate the lesion volume in order to enclose the whole lesion,
and different manual delineations also vary. Human interaction should
therefore be avoided, and a highly efficient, automatic lung lesion
segmentation approach is required. As shown in Fig. 1, the diversity of
lung lesions makes segmentation accuracy poor. To increase the
segmentation accuracy, automatic segmentation using an improved
toboggan algorithm is used in the proposed work.

The segmented lung lesion region does not have an accurate boundary
because the pixel values of the lesion and of its adjacent tissues are
similar [6]. The performance and efficiency of detection therefore
depend greatly on the feature vectors [7]. The feature vectors used in
recent retrieval and classification systems utilize visual information
of the image such as shape [8], texture [9], edges [10] and color
histograms [11]. Texture-based image descriptors in particular have
been widely used in pattern recognition to capture the fine details of
the image [12]. The number of operations required to compute features
that carry information about image textural characteristics, such as
homogeneity, gray-tone linear dependencies, contrast, and the number
and nature of boundaries present, is proportional to the number of
resolution cells in the image, so these features are quickly computable
[13]. Thus, in the proposed work, local binary pattern, wavelet,
contourlet and gray level co-occurrence matrix features are extracted
and classified to obtain efficient detection.
Fig. 1: Different types of lung lesions


A.  Related  work 

Many researchers have carried out relevant work on lesion segmentation,
feature extraction and classification based on lung CT data. D. M.
Campos et al. [3] proposed a supervised lung nodule segmentation method
that uses shape, contrast and intensity to produce three preliminary
segmentations and an artificial neural network to obtain an accurate
segmentation. In [4], S. Diciotti et al. proposed an automated
correction method based on local shape analysis using 3-D geodesic
distance map representations. Its advantage is that the nodule
segmentation is corrected only for recognized vessel attachments. In
[12], a novel image feature descriptor based on the local wavelet
pattern (LWP) was proposed to characterize medical CT images for
content-based CT image retrieval. The LWP is derived for each pixel of
the CT image by utilizing the relationship of the centre pixel with its
local neighbouring information. In [13], D. Wu et al. proposed a
stratified statistical learning approach that uses nodule texture, gray
scale, shape and curvature, built to provide a reasonable segmentation
for later classification. In [14], A. A. Farag et al. proposed a
general lung nodule shape model using implicit spaces, in which the
shape model is fused with image intensity statistical information. This
method analyses the shape of the nodule, but its processing time is
high. In [15], a novel pathological lung segmentation method was
proposed in which a fuzzy connectedness algorithm was used to estimate
the lung volume and texture features were used to identify abnormal
imaging patterns.

To perform automatic lung lesion segmentation, it is necessary to
develop an automatic and accurate method for seed point selection.
Using the algorithm of [16], the error caused by manual analysis was
reduced and the segmentation accuracy was improved. In [24], the
problem of discriminating between a bright object against a dark
background and vice versa, which is inherent in LBP and LTP, was
solved. The clinical value of tumour heterogeneity has been measured
and compared with traditional image features; the extracted features
are applied to a trained SVM, which classifies them with higher
accuracy [30], but this requires more samples. The accuracy of the kNN
classifier depends heavily on the choice of k. Estimating k for a given
test point is difficult because of several factors, such as the local
distribution of training points around that test point, the presence of
outliers in the dataset and the dimensionality of the feature space. A
dynamic k estimation algorithm based on the neighbour density function
of a test point, the class variance and the certainty factor
information of the training points was therefore proposed. The
performance of the kNN algorithm with this choice of k was evaluated
and found to improve [31], but it consumes more time.

All the above-mentioned methods provide feasible ways of detecting lung
lesions, but they require pre-processing, and the amount of human
interaction they need is not clear. To provide fast, accurate and
reliable lung lesion detection for clinical diagnosis, feature
extraction and classification are applied in this work to overcome the
above issues.

II.  METHODS 

The proposed method identifies the lesion region of the lung in a CT
image. It achieves automatic selection of the lesion seed point, which
is one of the main problems in lesion segmentation. Fig. 2 gives an
overview of the proposed method as a flow chart. The input is a slice
of a CT lung lesion image. Pre-processing uses Otsu thresholding to
remove unwanted signals from the CT lung lesion image. The improved
toboggan algorithm first selects the seed points and then segments the
lung lesion image automatically by region growing. The lung lesion
boundary is then obtained by smoothing in the lesion refining stage.
From the segmented lung lesion region, LBP, GLCM, wavelet and
contourlet features are extracted to improve the accuracy. Finally, the
extracted features are classified using the SVM and KNN classifiers.
Fig. 2: Block diagram of proposed segmentation algorithm

A.  Pre-Processing 

Pre-processing is performed first. Otsu thresholding is used in the
pre-processing stage to remove noise from the CT lung lesion image so
that the segmentation can be done precisely.
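As an illustration of the thresholding step, the following is a minimal sketch of Otsu's method, which picks the gray level that maximizes the between-class variance of the histogram. It is not the authors' implementation: the function name `otsu_threshold`, the 8-bit gray range and the toy pixel values are assumptions made for the example, and it is kept dependency-free for readability.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    `pixels` is a flat list of integer gray levels in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue        # lower class still empty
        if w1 == 0:
            break           # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


# Hypothetical toy data: dark background values and bright tissue values.
toy_pixels = [10, 12, 11, 9, 200, 202, 198, 201]
t = otsu_threshold(toy_pixels)
mask = [p > t for p in toy_pixels]  # True marks the brighter class
```

On real CT slices the histogram would be computed over the whole image, and an array library would normally replace the explicit loops.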

B.  Lesion segmentation using improved toboggan algorithm

The improved toboggan algorithm overcomes the over-segmentation caused
by the conventional toboggan approach by using a new back-off iteration
to calculate the local minimum pixels. Multi-scale Gaussian convolution
is also used, which captures image gradient changes in different
directions. With the improved toboggan method, the highlighted blood
vessels and noise in the gradient image are moved to the lower values
of the lung field, while the lesion retains the higher value. The
lesion is thus enhanced in the label image for automatic seed point
selection.

In this method, the seed points for the segmentation process are first
selected by performing a back-off iteration. Small seed regions with
the same minimum gradient value are marked with the same label. A
toboggan gradient stack is used to store the local minimum pixels. Each
new local minimum pixel obtained from the gradient magnitude of the
image is compared with the existing elements of the stack.


Fig. 3:  Lesion  segmentation  using  improved  toboggan 
algorithm 


The value in the stack with the maximum similarity to the new local
minimum is marked as the original source pixel. The new minimum pixel
value is then pushed onto the stack and used to label the source pixel.
Iterative growing under a distance constraint and a growing degree
constraint is then performed to select the seed points. This approach
thus reduces over-segmentation [6]. Because of their intensity, blood
vessels close to the lung lesion are sometimes considered part of the
adjacent lesion, so a lung lesion refining method is used to eliminate
all incorrectly included vascularised regions and other tissues.

The lesion refining method provides more accurate lesion boundary
detection by smoothing, which improves the accuracy.
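For orientation, the sketch below shows the conventional toboggan labelling that the improved algorithm builds on: every pixel slides to its lowest-gradient neighbour until it reaches a local minimum, and all pixels that reach the same minimum share a label. It does not implement the back-off iteration, the gradient stack or the growing constraints described above. The function name `toboggan_labels` and the toy gradient image are assumptions made for the example.

```python
def toboggan_labels(grad):
    """Label each pixel by the local gradient minimum it slides down to.

    `grad` is a 2-D list of gradient magnitudes. Returns a 2-D list of
    integer labels starting at 1. Flat minima are not merged here.
    """
    rows, cols = len(grad), len(grad[0])

    def lowest_neighbour(r, c):
        # Position of the strictly lowest value in the 8-neighbourhood,
        # or (r, c) itself if no neighbour is lower.
        best = (r, c)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    if grad[nr][nc] < grad[best[0]][best[1]]:
                        best = (nr, nc)
        return best

    labels = [[0] * cols for _ in range(rows)]
    minima = {}  # position of a local minimum -> its label
    for r in range(rows):
        for c in range(cols):
            pos = (r, c)
            while True:
                nxt = lowest_neighbour(*pos)
                if nxt == pos:
                    break  # reached a local minimum
                pos = nxt
            labels[r][c] = minima.setdefault(pos, len(minima) + 1)
    return labels


# Hypothetical gradient image: two basins separated by a high ridge.
toy_grad = [
    [1, 2, 9, 2, 1],
    [2, 3, 9, 3, 2],
    [3, 4, 9, 4, 3],
]
toy_labels = toboggan_labels(toy_grad)
```

Because each distinct minimum starts its own region, noisy gradients produce many small regions, which is the over-segmentation that the improved method is designed to reduce.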

C.  Feature  Extraction 
(i) Gray level co-occurrence matrix

A GLCM $g[i, j]$ is defined by specifying a displacement vector
$d = (dx, dy)$. The GLCM measures used are entropy (randomness of the
gray level distribution), energy (uniformity of gray levels in a region
of interest), contrast (difference between the gray levels of pixels)
and homogeneity (similarity of texture of pixels). Assume that the gray
level appearing in each cell is quantized to $N_g$ levels. Let
$L_x = \{1, 2, \ldots, N_x\}$ be the horizontal spatial domain,
$L_y = \{1, 2, \ldots, N_y\}$ be the vertical spatial domain, and
$G = \{1, 2, \ldots, N_g\}$ be the set of $N_g$ quantized gray levels.
The set $L_y \times L_x$ is the set of resolution cells of the image
ordered by their row-column designation. The image $I$ can be
represented as a function which assigns some gray level in $G$ to each
pair of coordinates in $L_y \times L_x$; $I: L_y \times L_x \to G$.
These measures are computed from arrays termed angular nearest-neighbour
gray level spatial-dependence matrices. To describe these matrices, the
adjacent or nearest-neighbour resolution cells must be specified. We
consider a resolution cell together with its eight nearest neighbours.


Fig. 4: Processing of graycomatrix to fill the values in the GLCM

The graycomatrix function creates a gray-level co-occurrence matrix
(GLCM) from image I. It creates the GLCM by calculating how often a
pixel with gray-level value i occurs horizontally adjacent to a pixel
with the value j. Each element (i, j) in the GLCM specifies the number
of times that a pixel with value i occurred horizontally adjacent to a
pixel with value j. If I is a binary image, graycomatrix scales the
image to two gray levels. If I is an intensity image, graycomatrix
scales the image to eight gray levels. Element (1,1) in the GLCM
contains the value 1 because there is only one occurrence in the image
where two horizontally adjacent pixels have the values 1 and 1. Element
(1,2) in the GLCM contains the value 2 because there are two
occurrences in the image where two horizontally adjacent pixels have
the values 1 and 2. graycomatrix continues this processing to calculate
all the values in the GLCM.
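The counting procedure and the four texture measures can be sketched as follows. This is an illustration rather than the authors' code: the function names are hypothetical, the small image is an assumed example chosen so that element (1,1) is 1 and element (1,2) is 2 as in the description above, and the feature formulas follow one common set of definitions (homogeneity with $1 + |i - j|$ in the denominator, entropy in bits).

```python
import math


def glcm_horizontal(image, levels):
    """Count how often gray level i has gray level j as right-hand neighbour.

    Gray levels in `image` are 1-based (1..levels), as in the text, so
    element (i, j) of the text is glcm[i - 1][j - 1] here.
    """
    glcm = [[0] * levels for _ in range(levels)]
    for row in image:
        for left, right in zip(row, row[1:]):
            glcm[left - 1][right - 1] += 1
    return glcm


def glcm_features(glcm):
    """Contrast, energy, homogeneity and entropy of a normalized GLCM."""
    total = sum(sum(row) for row in glcm)
    contrast = energy = homogeneity = entropy = 0.0
    for i, row in enumerate(glcm):
        for j, count in enumerate(row):
            p = count / total
            contrast += (i - j) ** 2 * p
            energy += p * p
            homogeneity += p / (1 + abs(i - j))
            if p > 0:
                entropy -= p * math.log2(p)
    return {
        "contrast": contrast,
        "energy": energy,
        "homogeneity": homogeneity,
        "entropy": entropy,
    }


# Assumed example image with eight gray levels.
example_image = [
    [1, 1, 5, 6, 8],
    [2, 3, 5, 7, 1],
    [4, 5, 7, 1, 2],
    [8, 5, 1, 2, 5],
]
example_glcm = glcm_horizontal(example_image, levels=8)
example_features = glcm_features(example_glcm)
```

Other displacement vectors (vertical, diagonal) only change which pixel pairs are counted; the feature formulas stay the same.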

(ii)  Local  Binary  Pattern 

LBP is invariant to monotonic intensity changes of pixel values. Hence,
it is robust to illumination and contrast variations. LBP is a
particular case of the Texture Spectrum model. It has been found to be
a powerful feature for texture classification. It has further been
determined that when LBP is combined with the histogram of oriented
gradients (HOG) descriptor, performance improves considerably. The LBP
feature vector, in its simplest form, is created in the following
manner:

• Divide the examined window into cells.

• For each pixel in a cell, compare the pixel to each of its 8
neighbours.

• If the centre pixel's value is greater than the neighbour's value,
write "0"; otherwise, write "1". This gives an 8-digit binary number.

• Compute the histogram of these numbers over the cell.

• Normalize the histogram.

• Concatenate the normalized histograms of all cells. This gives a
feature vector for the entire image.

The LBP [38] code at (x, y) is calculated as follows:

$$\mathrm{LBP}_{x,y} = \sum_{b=0}^{B-1} S(p_b - p_c)\, 2^b, \qquad
S(z) = \begin{cases} 1, & z \ge 0 \\ 0, & z < 0 \end{cases}$$

where $p_c$ is the pixel value at (x, y), $p_b$ is the pixel value
estimated using bilinear interpolation from neighbouring pixels at the
b-th location on the circle of radius R around $p_c$, and B is the
total number of neighbouring pixels. A $2^B$-bin histogram is computed.
Some patterns occur more often than others, and for these the number of
state transitions between 0 and 1 is at most two [24]. Those patterns
are called uniform patterns and the others non-uniform patterns. By
giving each uniform pattern its own bin and combining all non-uniform
patterns into a single bin, the number of bins is reduced. For example,
if B = 8, it is reduced from 256 to 59.
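The following sketch illustrates the LBP code and the reduction to 59 bins for B = 8. It is a simplified illustration, not the authors' implementation: it uses the eight immediate neighbours instead of bilinear interpolation on a circle of radius R, the neighbour ordering is an arbitrary convention, and all function names are hypothetical.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit LBP code of pixel (r, c): bit b is 1 when neighbour b >= centre."""
    centre = image[r][c]
    code = 0
    for b, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
        if image[r + dr][c + dc] >= centre:
            code |= 1 << b
    return code


def is_uniform(code, bits=8):
    """True when the circular bit pattern has at most two 0/1 transitions."""
    transitions = 0
    for b in range(bits):
        current = (code >> b) & 1
        following = (code >> ((b + 1) % bits)) & 1
        if current != following:
            transitions += 1
    return transitions <= 2


# Each uniform pattern gets its own bin; the last bin collects the rest.
UNIFORM_BIN = {}
for _code in range(256):
    if is_uniform(_code):
        UNIFORM_BIN[_code] = len(UNIFORM_BIN)
NUM_BINS = len(UNIFORM_BIN) + 1


def lbp_histogram(image):
    """Normalized uniform-LBP histogram over the interior pixels of a cell."""
    hist = [0.0] * NUM_BINS
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[UNIFORM_BIN.get(lbp_code(image, r, c), NUM_BINS - 1)] += 1
    count = sum(hist)
    return [h / count for h in hist]
```

With B = 8 there are 58 uniform patterns, so `NUM_BINS` evaluates to 59, matching the reduction from 256 bins stated above.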

(iii) Contourlet transform

The contourlet transform is used to obtain smooth contours of images by
means of a double filter bank structure. It provides fast execution
based on a Laplacian pyramid decomposition followed by directional
filter banks. In the double filter bank, the Laplacian pyramid (LP) is
used first to capture point discontinuities. A directional filter bank
(DFB) is then used to link these point discontinuities into linear
structures. The LP decomposition produces only one band-pass image at
each level, which avoids frequency scrambling, and the DFB allows only
high-frequency signals to pass through its directional sub-bands. Thus,
the DFB combined with the LP provides a better multiscale decomposition
and removes the low-frequency content. Image signals are therefore
passed through the LP sub-bands to obtain band-pass signals and then
through the DFB to capture the directional information of the image.
This double filter bank structure combining the LP and the DFB is also
called the pyramidal directional filter bank (PDFB).
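To make the LP stage concrete, the sketch below performs one Laplacian pyramid level with a deliberately crude filter: 2x2 block averaging as the low-pass step and pixel replication as the upsampling step. It only illustrates how a single band-pass image arises as the difference between the image and its low-pass prediction. The directional filter bank is not implemented, the filter choice is an assumption, and the function names are hypothetical.

```python
def lp_decompose(image):
    """One Laplacian pyramid level for an image with even height and width.

    Returns (low, band): `low` is the half-resolution low-pass image and
    `band` is the full-resolution band-pass residual.
    """
    rows, cols = len(image), len(image[0])
    low = [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]
    # Band-pass image = original minus the upsampled low-pass prediction.
    band = [
        [image[r][c] - low[r // 2][c // 2] for c in range(cols)]
        for r in range(rows)
    ]
    return low, band


def lp_reconstruct(low, band):
    """Invert lp_decompose by adding the prediction back to the residual."""
    return [
        [band[r][c] + low[r // 2][c // 2] for c in range(len(band[0]))]
        for r in range(len(band))
    ]
```

In a contourlet decomposition the `band` image would then be fed to the directional filter bank, and `low` would be decomposed again at the next scale.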

(iv) Wavelet transform

The discrete wavelet transform generates useful subsets of frequency
components (or scales) from the region of interest. It is related to
the fast Fourier transform. It creates a high-dimensional feature
vector, so dimensionality reduction is necessary; feature selection and
feature projection methods are used for this purpose. Wavelet
transforms give a time-frequency representation of continuous-time
signals and are therefore related to harmonic analysis. Discrete
wavelet transforms use discrete-time filter banks, which contain either
finite impulse response (FIR) or infinite impulse response (IIR)
filters. The continuous wavelet transform (CWT) is subject to the
uncertainty principle of Fourier analysis with respect to sampling
theory [12].

When a region of interest contains some event, it is difficult to
assign simultaneously an exact time and an exact frequency response
scale to that event. The product of the uncertainties of time and
frequency response scale has a lower bound. Thus, in the continuous
wavelet transform of that region, such an event marks an entire region
in the time-scale plane instead of just one point.


The approximation coefficients $a_j$ and detail coefficients $d_j$
obtained through the discrete wavelet transform have a mean and a
variance as follows. The mean is given by

$$E(a_j) = 2^{j/2} \mu_0$$

and the variance is given by

$$V(a_j) = \Big( \sum_n h_n^2 \Big)^{j} \sigma_0^2$$

where $\mu_0$ and $\sigma_0^2$ are the mean and the variance of the
data, $j$ is the scale of the wavelet transform, and $h$ and $g$ are
the scaling and wavelet filters, respectively.
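A one-level Haar transform gives a small worked instance of how approximation and detail coefficients are produced and how their mean and variance can be measured. The Haar filter is an assumption made for the example (the text does not name the wavelet used), and the function names are hypothetical. For this filter each level multiplies the mean of a constant signal by $\sqrt{2}$.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT of an even-length signal.

    Returns (approximation, detail) coefficient lists, each half as long
    as the input.
    """
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale
              for i in range(0, len(signal), 2)]
    return approx, detail


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


# Hypothetical region-of-interest samples.
roi = [1.0, 2.0, 3.0, 4.0]
approx, detail = haar_dwt(roi)
stats = {"mean": mean(approx), "variance": variance(approx)}
```

Applying `haar_dwt` again to `approx` gives the next scale; statistics such as those in Table 1 would be computed from coefficients of this kind.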

Table 1: Statistical measures by wavelet transform

FEATURES    NORMAL IMAGE    ABNORMAL IMAGE
MEAN        0.202           0.286
VARIANCE    0.342           2.348

D.  Classification 
(i) KNN classifier

The KNN classifier is one of the most basic classifiers in pattern
recognition for classifying extracted features. A feature-extracted
region is classified by a majority vote of its neighbours [31]: the
classifier predicts the class from the values of the nearest
neighbours. The important issues involved in training this classifier
are:

1.  Validating the value of K

2.  The type of distance metric used to classify

To make a prediction using the KNN classifier, the following steps are
followed:

1.  Compute the distance from the test sample to every training sample.

2.  Find the k nearest vectors.

3.  Keep the number of selected nearest neighbours much smaller than
the number of data points in a class.

4.  Arrange the measured distances in ascending order and choose the
label of the closest neighbours.

In the proposed method, the training and testing phases are carried out
in the following stages.

Training phase:

1.  The training images are placed in a folder.

2.  The training samples are read.

3.  The KNN method is applied to the extracted features, which are
taken as training samples, and tuned for the testing phase.

Testing phase:

1.  The test images are read.

2.  KNN is applied and the nearest neighbours are identified in the
training data using the Euclidean distance function.

3.  If the K neighbours all have the same label, classification stops.
Otherwise, the distances between the K neighbours are compared and the
Euclidean distance matrix is constructed.

4.  KNN classification is then complete and the output is displayed.


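Euclidean distance, $d = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$, where $x_i$ and $y_i$ are the i-th components of the two feature vectors and $n$ is their dimension.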
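The prediction steps can be sketched as a plain majority-vote KNN with Euclidean distance. This is a minimal illustration, not the authors' implementation: the function names, the value k = 3 and the two-dimensional feature vectors (loosely styled after mean and variance features) are hypothetical.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    # Distances sorted in ascending order, paired with their labels.
    distances = sorted(
        (euclidean(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training features and labels.
train_x = [(0.20, 0.30), (0.25, 0.35), (0.30, 0.30),
           (2.30, 2.40), (2.20, 2.50), (2.40, 2.30)]
train_y = ["normal", "normal", "normal",
           "abnormal", "abnormal", "abnormal"]

prediction = knn_predict(train_x, train_y, (0.28, 0.31), k=3)
```

An odd k avoids tied votes in a two-class problem; in practice k would be validated on held-out data, as noted in the list of training issues.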




III.  RESULTS 

A.  Datasets 

Publicly available and in-house acquired datasets are used to evaluate
the performance of segmenting and classifying lung lesions in CT
images. The images have a slice thickness ranging from 1.25 to 2.50 mm
and a resolution of 0.70 mm x 0.70 mm.


Fig. 4: k-nearest neighbour searching


B.  Human  Interaction 

In the automatic lung lesion segmentation method using the improved
toboggan algorithm, no human interaction is needed for the initial seed
point selection or for the subsequent feature extraction and
classification.

C.  Performance  and  Analysis  of  Proposed  method 

The output obtained for segmentation, feature extraction and
classification of the lesion is shown here.


(ii) SVM classifier

Support vector machines are supervised learning models used to analyse
data for classification and regression analysis [25]. SVMs also perform
non-linear classification efficiently. A support vector machine
constructs a hyperplane which is used for classification.

The SVM classifier involves two stages, training and testing. The steps
involved are as follows:

1.  The training samples are stored in a folder.

2.  The kernel function is defined based on the training feature
vectors.

3.  It is used to measure the relative nearness of each test point to
the data points.

4.  A hyperplane is thus formed which separates the normal and abnormal
regions.

5.  SVM classification is then complete and the result is displayed
using a receiver operating characteristic (ROC) curve.

Fig. 5 shows the SVM classifier with hyperplanes, where H1 does not
separate the classes, H2 does, but only with a small margin, and H3
separates them with the maximum margin.
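The difference between the three hyperplanes can be checked numerically through the geometric margin, the smallest signed distance of any sample to the hyperplane $w \cdot x + b = 0$. A negative value means at least one sample lies on the wrong side. The points, labels and hyperplane parameters below are hypothetical values chosen to mimic the situation in Fig. 5, not data from the paper, and no SVM training is performed.

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance y * (w . x + b) / ||w|| over all samples."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )


# Hypothetical 2-D samples: +1 for one class, -1 for the other.
POINTS = [(3, 3), (4, 3), (3, 4), (0, 0), (1, 0), (0, 1)]
LABELS = [+1, +1, +1, -1, -1, -1]

# Hypothetical candidate hyperplanes as (w, b) pairs.
HYPERPLANES = {
    "H1": ((1.0, 0.0), -3.5),  # misclassifies two positive samples
    "H2": ((1.0, 0.0), -2.0),  # separates, small margin
    "H3": ((1.0, 1.0), -3.5),  # separates, largest margin of the three
}

margins = {
    name: geometric_margin(w, b, POINTS, LABELS)
    for name, (w, b) in HYPERPLANES.items()
}
```

An SVM solver searches over all (w, b) for the hyperplane with the largest such margin instead of comparing a fixed set of candidates.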


Fig. 5:  SVM  Classifier  with  Hyperplane 


IV.  DISCUSSION

The proposed method consists of automatic lung lesion seed point
selection, lesion extraction and lung lesion refinement stages,
followed by feature extraction and classification of the lung lesion.
No human interaction is needed in this method. Feature extraction is
applied to the segmented image because tissues connected to the lesion
with similar pixel intensity cannot be recognized with a high degree of
confidence. The extracted features are then classified using the SVM
and KNN classifiers, and their performance is compared. A limitation of
our work is that the patient sample size is small, and we could not
include more image features in training because of the limited training
data. Our future work will therefore focus on increasing the size of
our test samples and evaluating the performance.

V.  CONCLUSION 

In this paper, an automatic, stable and fast segmentation method
followed by feature extraction and classification has been tested on
both public and clinical datasets. The initial seed points were first
detected using an improved toboggan method for lesion segmentation.
The lesion features were then extracted using LBP, GLCM, wavelet and
contourlet descriptors. Finally, classification was performed using the
SVM and KNN classifiers and their performance was compared. The
important aspect of this work is that it does not require human
interaction for lesion seed point detection. Compared with other
segmentation methods, the proposed method shows better accuracy and
provides better classification results.

